Acta Psychologica Sinica ›› 2026, Vol. 58 ›› Issue (3): 416-436. doi: 10.3724/SP.J.1041.2026.0416
ZHOU Lei, LI Litong, WANG Xu, OU Huafeng, HU Qianyu, LI Aimei, GU Chenyan
Received: 2025-05-12
Published: 2026-03-25
Online: 2025-12-26
ZHOU Lei, LI Litong, WANG Xu, OU Huafeng, HU Qianyu, LI Aimei, GU Chenyan. (2026). Large language models capable of distinguishing between single and repeated gambles: Understanding and intervening in risky choice. Acta Psychologica Sinica, 58(3), 416-436.
URL: https://journal.psych.ac.cn/acps/EN/10.3724/SP.J.1041.2026.0416
Related Articles

[1] WANG Jianshu, JIANG Xiaowei, CHEN Yanan, WANG Minghui, DU Feng. From overt deterrence to covert internalization: Moral effects of AI regulation and the moderating role of personality traits [J]. Acta Psychologica Sinica, 2026, 58(3): 381-398.
[2] DAI Yiqing, MA Xinming, WU Zhen. LLMs amplify gendered empathy stereotypes and influence major and career recommendations [J]. Acta Psychologica Sinica, 2026, 58(3): 399-415.
[3] ZHU Naping, ZHANG Xia, ZHOU Jie, LI Yanfang. The development and motivations of children’s third-party intervention preference in group cooperation norm violation [J]. Acta Psychologica Sinica, 2026, 58(3): 516-533.
[4] YANG Shen-Long, HU Xiaoyong, GUO Yongyu. The psychological impact of economic situation: Intervention strategies and governance implications [J]. Acta Psychologica Sinica, 2026, 58(2): 191-197.
[5] WU Shiyu, WANG Yiyun. “Zero-shot language learning”: Can large language models “acquire” contextual emotion like humans? [J]. Acta Psychologica Sinica, 2026, 58(2): 308-322.
[6] JIAO Liying, LI Chang-Jin, CHEN Zhen, XU Hengbin, XU Yan. When AI “possesses” personality: Roles of good and evil personalities influence moral judgment in large language models [J]. Acta Psychologica Sinica, 2025, 57(6): 929-946.
[7] GAO Chenghai, DANG Baobao, WANG Bingjie, WU Michael Shengtao. The linguistic strengths and weaknesses of artificial intelligence: A comparison between large language models and real students in the Chinese context [J]. Acta Psychologica Sinica, 2025, 57(6): 947-966.
[8] ZHANG Yanbo, HUANG Feng, MO Liuling, LIU Xiaoqian, ZHU Tingshao. Suicidal ideation data augmentation and recognition technology based on large language models [J]. Acta Psychologica Sinica, 2025, 57(6): 987-1000.
[9] HUANG Feng, DING Huimin, LI Sijia, HAN Nuo, DI Yazheng, LIU Xiaoqian, ZHAO Nan, LI Linyan, ZHU Tingshao. Self-help AI psychological counseling system based on large language models and its effectiveness evaluation [J]. Acta Psychologica Sinica, 2025, 57(11): 2022-2042.
[10] XIN Ziqiang, WANG Luxiao, LI Yue. Differences in information processing between experienced investors and novices, and intervention in fund investment decision-making [J]. Acta Psychologica Sinica, 2024, 56(6): 799-813.
[11] TANG Meihui, TIAN Shuwan, XIE Tian. Beyond the myth of slimming: The impact of social norms on positive body image and caloric intake among young adults [J]. Acta Psychologica Sinica, 2024, 56(10): 1367-1383.
[12] REN Zhihong, ZHAO Chunxiao, TIAN Fan, YAN Yupeng, LI Danyang, ZHAO Ziyi, TAN Mengling, JIANG Guangrong. Meta-analysis of the effect of mental health literacy intervention in Chinese people [J]. Acta Psychologica Sinica, 2020, 52(4): 497-512.
[13] LI Aimei, WANG Haixia, SUN Hailong, XIONG Guanxing, YANG Shaoli. The nudge effect of “foresight for the future of our children”: Pregnancy and environmental intertemporal choice [J]. Acta Psychologica Sinica, 2018, 50(8): 858-867.
[14] REN Zhihong, ZHANG Yawen, JIANG Guangrong. Effectiveness of mindfulness meditation in intervention for anxiety: A meta-analysis [J]. Acta Psychologica Sinica, 2018, 50(3): 283-305.
[15] REN Zhihong, LI Xianyun, ZHAO Lingbo, YU Xianglian, LI Zhenghan, LAI Lizu, RUAN Yijun, JIANG Guangrong. Effectiveness and mechanism of internet-based self-help intervention for depression: The Chinese version of MoodGYM [J]. Acta Psychologica Sinica, 2016, 48(7): 818-832.